On-the-Fly Generalization Hierarchies for Numerical Attributes Revisited
Abstract
Generalization hierarchies are frequently used in computer science, statistics, biology, bioinformatics, and other areas when less specific values are needed for data analysis. Generalization is also one of the most widely used disclosure control techniques for anonymizing data. For numerical attributes, generalization is performed either with predefined generalization hierarchies or with a hierarchy-free model. Because hierarchy-free generalization is not suitable for anonymization in every scenario, generalization hierarchies are of particular interest for data anonymization. Traditionally, these hierarchies were created by the data owner with help from domain experts. While constructing a small hierarchy by hand is feasible, the effort grows for hierarchies with many levels. Newer approaches therefore generate numerical hierarchies automatically, on the fly. In this paper we extend an existing method for creating on-the-fly generalization hierarchies, present several existing information loss measures used to assess the quality of anonymized data, and run a series of experiments showing that our new method improves on existing methods for automatically generating on-the-fly numerical generalization hierarchies.
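To make the idea of a numerical generalization hierarchy concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes a hypothetical hierarchy in which level 0 keeps the exact value and each higher level doubles a fixed interval width. The names `generalize`, `interval_loss`, `base_width`, and `origin` are illustrative assumptions, and the width-based penalty is only one simple way to score information loss; it is not necessarily among the measures the paper discusses.

```python
def generalize(value, level, base_width=5, origin=0):
    """Map a number to the interval [lo, hi) that covers it at a hierarchy level.

    Assumption: level 0 is the exact value; level k >= 1 uses intervals of
    width base_width * 2**(k - 1), aligned to `origin`.
    """
    if level == 0:
        return (value, value)
    width = base_width * 2 ** (level - 1)
    lo = origin + ((value - origin) // width) * width
    return (lo, lo + width)


def interval_loss(interval, domain_min, domain_max):
    """Width of the generalized interval as a fraction of the attribute's domain.

    0.0 means no generalization; 1.0 means the value was generalized to the
    whole domain.
    """
    lo, hi = interval
    return (hi - lo) / (domain_max - domain_min)


# An age of 37 becomes coarser as the level increases.
for level in range(4):
    interval = generalize(37, level)
    print(level, interval, interval_loss(interval, 0, 100))
```

Higher levels trade precision for privacy: the interval grows, and so does the fraction of the domain it covers.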
Similar resources
Attribute-oriented Induction Using Domain Generalization Graphs
Howard J. Hamilton, Robert J. Hilderman, and Nick Cercone. Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada, S4S 0A2. Abstract: Attribute-oriented induction summarizes the information in a relational database by repeatedly replacing specific attribute values with more general concepts according to user-defined concept hierarchies. ...
Soft Foundation Strengthening Effect and Structural Optimization of a New Cement Fly-ash and Gravel Pile-slab Structure
Reducing the settlements of soft foundation effectively is a critical problem of high-speed railway construction in China. The new CFG pile-slab structure composite foundation is a ground treatment technique which is applied on CFG pile foundation and pile-slab structure composite foundation. Based on the experience of constructing Beijing-Shanghai high-speed railway in China, the settlement-co...
The representation and inferences of hierarchies
Hierarchy is an important relationship among knowledge. We identify the basic components and the common functionalities of hierarchies, develop a new class that provides the solution, and use it in implementing the different components and capabilities of a category. First, we use it to store the generalization/specialization relationships among knowledge such as data types, polygons, and categ...
Data Mining in Large Databases Using Domain Generalization
Attribute-oriented generalization summarizes the information in a relational database by repeatedly replacing specific attribute values with more general concepts according to user-defined concept hierarchies. We introduce domain generalization graphs for controlling the generalization of a set of attributes and show how they are constructed. We then present serial and parallel versions of the Mu...
A Method for Reasoning with Structured and Continuous Attributes in the INLEN-2 Multistrategy Knowledge Discovery System
Structured attributes have domains (value sets) that are partially ordered sets, typically hierarchies. Such attributes allow knowledge discovery programs to incorporate background knowledge about hierarchical relationships among attribute values. Inductive generalization rules for structured attributes have been developed that take into consideration the type of nodes in the domain hierarchy (...
Publication date: 2011